This paper investigates methods for training parameterized functions that guide state-space search algorithms. Existing work commonly generates training data for such guiding functions by solving problem instances with the current version of the guiding function. As a result, as training progresses, the guided search algorithm can solve more difficult instances that are, in turn, used to further train the guiding function. These methods assume that a set of problem instances of varied difficulty is provided. Since previous work was not designed to distinguish the instances the search algorithm can solve with the current guiding function from those it cannot, the algorithm commonly wastes time attempting, and failing, to solve many of these instances. In this paper, we improve upon these training methods by generating a curriculum for learning the guiding function that directly addresses this issue. Namely, we propose and evaluate a Teacher-Student Curriculum (TSC) approach in which the teacher is an evolutionary strategy that attempts to generate problem instances of "correct difficulty" and the student is a guided search algorithm that uses the current guiding function. The student attempts to solve the problem instances generated by the teacher. We conclude with experiments demonstrating that TSC outperforms the current state-of-the-art Bootstrap Learning method in three representative benchmark domains and with three guided search algorithms, with respect to the time required to solve all instances of the test set.
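The teacher-student interaction described above can be sketched as a loop: the teacher proposes candidate difficulties, the student attempts them, and the teacher keeps the difficulty whose observed solve rate is closest to a target. This is a toy illustration only, not the paper's actual algorithm; the scalar difficulty knob, the target solve rate of 0.5, and the mutation scheme are all illustrative assumptions.

```python
import random

def tsc_curriculum(solve, make_instance, d_min, d_max,
                   generations=10, pop=8, trials=4, target=0.5):
    """Toy Teacher-Student Curriculum loop (illustrative sketch).

    The 'teacher' is a tiny evolutionary strategy over a scalar
    difficulty knob; the 'student' is any solver returning True on
    success. The teacher retains the difficulty whose observed solve
    rate is closest to the target ("correct difficulty").
    """
    d = d_min
    for _ in range(generations):
        # Teacher: mutate the current difficulty into a small population.
        candidates = [max(d_min, min(d_max, d + random.choice([-1, 0, 1])))
                      for _ in range(pop)]
        # Student: attempt generated instances; score each candidate by
        # how far its solve rate falls from the target rate.
        scored = []
        for c in candidates:
            rate = sum(bool(solve(make_instance(c)))
                       for _ in range(trials)) / trials
            scored.append((abs(rate - target), c))
        d = min(scored)[1]
    return d
```

In the real method the student is a guided search algorithm whose guiding function is retrained on the instances it solves; here the solver is a black box to keep the sketch short.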
-
The A* algorithm is commonly used to solve NP-hard combinatorial optimization problems. When provided with a completely informed heuristic function, A* can solve such problems with time complexity that is polynomial in the solution cost and branching factor. In light of this fact, we examine a line of recent publications that propose fitting deep neural networks to the completely informed heuristic function. We assert that these works suffer from inherent scalability limitations: under the assumption that NP ⊄ P/poly, such approaches result in either (a) network sizes that scale super-polynomially in the instance sizes or (b) heuristic accuracy that scales inversely with the instance sizes. Complementing our theoretical claims, we provide experimental results for three representative NP-hard search problems. The results suggest that fitting deep neural networks to informed heuristic functions requires network sizes that grow quickly with the problem instance size. We conclude by suggesting that the research community should focus on scalable methods for integrating heuristic search with machine learning, as opposed to methods relying on informed heuristic estimation.
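The claim about a completely informed heuristic rests on the standard behavior of A*: the better h approximates the true cost-to-go, the fewer nodes A* expands, and with a perfect h it expands only nodes on an optimal path. A textbook A* (not the paper's code) makes the role of h concrete:

```python
import heapq

def astar(start, goal, neighbors, h):
    """Textbook A* search. neighbors(n) yields (successor, edge_cost)
    pairs; h(n) estimates the cost from n to the goal. With a
    completely informed h, every expanded node lies on an optimal
    path, which is what drives the polynomial-time claim above."""
    open_heap = [(h(start), 0, start)]   # (f = g + h, g, node)
    best_g = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g                     # cost of an optimal solution
        if g > best_g.get(node, float("inf")):
            continue                     # stale heap entry; skip
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None                          # goal unreachable
```

On a simple line graph with the exact heuristic h(n) = distance-to-goal, this expands exactly the nodes between start and goal.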
-
To maximize indoor daylight, design projects commonly use commercial optimization tools to find optimal window configurations. However, experiments show that such tools either fail to find the optimal solution or are very slow to compute under certain conditions. This paper presents a comparative analysis between a gradient-free optimization technique, Covariance Matrix Adaptation Evolution Strategy (CMA-ES), and the widely used Genetic Algorithm (GA)-based tool, Galapagos, for optimizing window parameters to improve indoor daylight in six locations across different latitudes. A novel combination of the daylight metrics sDA and ASE is proposed for single-objective optimization comparison. Results indicate that GA in Galapagos takes progressively more time to converge, from 11 minutes at the southernmost latitude to 11 hours at the northernmost, while the runtime for CMA-ES is consistently around 2 hours. On average, CMA-ES is 1.5 times faster than Galapagos while consistently producing optimal solutions. This paper can help researchers select appropriate optimization algorithms for daylight simulation based on latitude, runtime, and solution quality.
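The gradient-free optimization style compared above can be illustrated with a minimal evolution strategy. This is a deliberately simplified stand-in, not full CMA-ES: it adapts only the population mean and step size, with no covariance matrix adaptation, and uses a toy objective rather than a daylight simulation.

```python
import random

def simple_es(f, x0, sigma=0.5, lam=12, iters=80):
    """Minimal (mu/mu, lambda) evolution strategy, a toy stand-in for
    CMA-ES. Samples lam candidates around the current mean, keeps the
    best quarter, and recenters the mean on their average. Real CMA-ES
    additionally adapts a full covariance matrix of the sampling
    distribution; here sigma simply decays each generation."""
    mean = list(x0)
    mu = lam // 4                        # number of elite samples
    for _ in range(iters):
        pop = [[m + sigma * random.gauss(0, 1) for m in mean]
               for _ in range(lam)]
        pop.sort(key=f)                  # minimize f
        elite = pop[:mu]
        mean = [sum(x[i] for x in elite) / mu for i in range(len(mean))]
        sigma *= 0.97                    # crude step-size decay
    return mean
```

In the paper's setting, f would be the (expensive) daylight score of a window configuration, which is exactly where a sample-efficient gradient-free optimizer pays off.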
-
We address a mechanism design problem where the goal of the designer is to maximize the entropy of a player's mixed strategy at a Nash equilibrium. This objective is of special relevance to video games, where game designers wish to diversify the players' interaction with the game. To solve this design problem, we propose a bi-level alternating optimization technique that (1) approximates the mixed-strategy Nash equilibrium using a Nash Monte-Carlo reinforcement learning approach and (2) applies a gradient-free optimization technique (Covariance Matrix Adaptation Evolution Strategy) to maximize the entropy of the mixed strategy obtained in level (1). The experimental results show that our approach achieves results comparable to the state-of-the-art approach on three benchmark domains: "Rock-Paper-Scissors-Fire-Water", "Workshop Warfare", and "Pokemon Video Game Championship". Next, we show that, unlike previous state-of-the-art approaches, the computational complexity of our proposed approach scales significantly better in larger combinatorial strategy spaces.
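The inner level of the bi-level scheme needs an approximate mixed-strategy Nash equilibrium whose entropy the outer level can then score. As a small illustration of that inner step, the sketch below approximates the equilibrium of a symmetric zero-sum matrix game with fictitious play, a classical stand-in for the paper's Nash Monte-Carlo reinforcement learning approach, and computes the entropy objective:

```python
import math

def entropy(p):
    """Shannon entropy (nats) of a mixed strategy."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def fictitious_play(payoff, iters=3000):
    """Approximate a mixed Nash equilibrium of a symmetric zero-sum
    game via fictitious play. payoff[i][j] is the row player's payoff.
    Each round, play a best response to the opponent's empirical
    strategy; the empirical frequencies converge to equilibrium."""
    n = len(payoff)
    counts = [1] * n
    for _ in range(iters):
        total = sum(counts)
        values = [sum(payoff[i][j] * counts[j] for j in range(n)) / total
                  for i in range(n)]
        counts[max(range(n), key=lambda i: values[i])] += 1
    total = sum(counts)
    return [c / total for c in counts]
```

For standard Rock-Paper-Scissors the equilibrium is uniform, which is also the entropy maximizer; the outer CMA-ES level in the paper instead searches over game parameters to push the equilibrium toward higher entropy.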